Abstract. The number of "nonequivalent" Huffman codes of length r over an alphabet of size
t has been studied frequently. Equivalently, the number of "nonequivalent" complete t-ary trees
has been examined. We first survey the literature, unifying several independent approaches to
the problem. Then, improving on earlier work, we prove a very precise asymptotic result on the
counting function, consisting of two main terms and an error term.



1. Introduction

1.1. A problem in coding theory. Let a source $S$ emit $r$ words $w_1, \dots, w_r$ with probabilities
$p_1, \dots, p_r$ respectively. Here $0 < p_i < 1$ and $\sum_{i=1}^{r} p_i = 1$. To each word $w_i$ we assign a code
word $c_i = c_i(w_i)$ over an alphabet of size $t$. Let $l_i$ denote the length of the codeword $c_i$. For a
given source $S$, a compact code minimises the average length $\bar{l} = \sum_{i=1}^{r} p_i l_i$. Huffman [16] showed
how to construct a code with minimum average word length, given the word probabilities $p_i$.
These Huffman codes are prefix-free, and can therefore be decoded instantaneously. Moreover,
these codes can be found efficiently.

The Kraft-McMillan inequality states: For an alphabet of size $t$ and a source that emits $r$
words, a necessary and sufficient condition for the existence of an instantaneous code with code
word lengths $l_1, \dots, l_r$ is that

$$\sum_{i=1}^{r} \frac{1}{t^{l_i}} \le 1. \tag{1.1}$$

Moreover, for the existence of a uniquely decipherable code inequality (1.1) is necessary.
Let us call a code compact if it satisfies the Kraft equality:

$$\sum_{i=1}^{r} \frac{1}{t^{l_i}} = 1. \tag{1.2}$$

When multiplying the equation by $t^{l_r}$ we observe that in a compact code the number of
codewords of maximal length $l_r$ is divisible by $t$. Also, if there are two distinct codewords starting
with the same prefix $a_1 \dots a_q$ but then continuing differently, $a_1 \dots a_q b_1 \dots$ and $a_1 \dots a_q b_2 \dots$,
then all $t$ possible symbols must occur at position $q + 1$. In other words, if a sequence branches,
it branches into all $t$ possible directions. This is the reason why it is possible to model the
situation by means of a rooted t-ary tree, which we do below. As it is possible to arrive from a
given Huffman code at a solution of equation (1.2) and, vice versa, to arrive from a solution of
this equation at an admissible Huffman code, it is natural to consider all Huffman codes with
the same set of word lengths as "equivalent" codes.
Example: Let t = 3. Let the code consist of the codewords:

00, 010, 011, 012, 02, 1, 20, 21, 220, 221, 222.

The code can be nicely represented by the tree in Figure 1.

Figure 1. Rooted tree corresponding to the code {00, 010, 011, 012, 02, 1, 20, 21, 220, 221, 222}.
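The two properties used so far, the Kraft equality (1.2) and prefix-freeness, can be checked mechanically for the example code. The following is only a minimal sketch; the helper names `kraft_sum` and `is_prefix_free` are our own and do not come from the cited literature.

```python
from fractions import Fraction


def kraft_sum(codewords, t):
    """Exact value of the sum of t**(-length) over all codewords."""
    return sum(Fraction(1, t ** len(word)) for word in codewords)


def is_prefix_free(codewords):
    """True if no codeword is a proper prefix of another codeword."""
    words = sorted(codewords)
    # In sorted order, a word that is a prefix of some other word is
    # immediately followed by one of its extensions.
    return all(not nxt.startswith(cur) for cur, nxt in zip(words, words[1:]))


# The ternary example code from the text (t = 3).
code = ["00", "010", "011", "012", "02", "1", "20", "21", "220", "221", "222"]
print(kraft_sum(code, 3), is_prefix_free(code))
```

Exact rational arithmetic is used so that equality with 1 can be tested without rounding issues.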

Below we list a number of alternative ways of defining our main object. This reflects that 
the same type of question has been studied from various points of view, often without being 
aware of the corresponding results expressed in a different mathematical language. 

We use Kraft's equality as the basis for our first definition. It stresses the number theoretic
properties and was at the origin of Boyd's work [5].

Definition 1 (Number theoretic definition). Let $f_t(r)$ denote the number of solutions of the
equation

$$\sum_{i=1}^{r} \frac{1}{t^{x_i}} = 1,$$

where the $x_i$ are nonnegative integers and $0 \le x_1 \le \dots \le x_r$.

For more information on other counting functions related to representations of one as a sum 
of unit fractions, see and [8]. 

Collecting the number of words of the same length (corresponding to $x_i$ in the last definition),
one arrives at an alternative definition: From our point of view, all codes with the same number
of words of a given length are equivalent. This suggests the following definition:

Definition 2 (Huffman sequences). Let $t \ge 2$ and $r \ge 1$ be positive integers. Let $f_t(r)$ denote
the number of sequences of non-negative integers

$$(a_0, a_1, \dots, a_l), \quad l \ge 0, \quad a_l > 0, \quad \sum_{i=0}^{l} a_i = r, \quad \sum_{i=0}^{l} \frac{a_i}{t^i} = 1.$$

1.2. Rooted trees. Let us recall some vocabulary from graph theory: A rooted tree is a
connected cycle-free graph, with one vertex being distinguished (the root). (We will draw it at
the top, all other vertices below.) We say the tree is t-ary if all vertices other than
the root are either a leaf, that is an end of a path from the root, or have one predecessor and
t children. All non-leaves are called inner vertices. Note that the root is also an inner vertex,
except for the trivial tree of order one. In other words, for the trees we consider, the root has
degree t, all other vertices either have degree 1 (leaf) or have degree t + 1.

Definition 3 (Canonical rooted tree). A rooted tree is called canonical if its corresponding prefix 
code has the property that the lexicographic ordering of its words corresponds to a nondecreasing 
ordering of the word lengths. 
Let us say that two rooted t-ary trees are equivalent, if their number of leaves at distance i
from the root is the same, for all i. Let $f_t(r)$ denote the number of equivalence classes of t-ary
rooted trees with exactly r leaves.

Note that each equivalence class contains exactly one canonical tree. Also, if the tree has
$a_i$ leaves at distance $i$ from the root, then $\sum_i a_i / t^i = 1$. This follows inductively, since a leaf
at distance $i$ from the root, i.e. one which contributes a weight $1/t^i$, can be split into $t$ children at
distance $i + 1$, of weight $1/t^{i+1}$ each. As these rooted t-ary trees correspond to a compact code,
we also call these trees "compact trees".

Using Definition 3 one would for example replace the code

{00, 010, 011, 012, 02, 1, 20, 21, 220, 221, 222}

by the following equivalent code:

{0, 10, 11, 12, 20, 210, 211, 212, 220, 221, 222}.

The corresponding canonical rooted tree is in Figure 2. In our usual way of drawing these
diagrams, a canonical tree therefore has the longer paths as far to the right hand side as possible.

Figure 2. Canonical tree, corresponding to {0, 10, 11, 12, 20, 210, 211, 212, 220, 221, 222}.
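The passage from a compact code to its canonical representative only depends on the multiset of word lengths. A possible way to carry it out is sketched below; the function names are hypothetical, and the construction (assign consecutive base-t numbers to the lengths in nondecreasing order) is the usual canonical-code technique rather than something taken from the sources cited here. It assumes that the lengths satisfy the Kraft equality and that t is at most 10, so that every digit is a single character.

```python
def to_base_t(value, length, t):
    """Write value in base t using exactly `length` digits."""
    digits = []
    for _ in range(length):
        value, digit = divmod(value, t)
        digits.append(str(digit))
    return "".join(reversed(digits))


def canonical_code(lengths, t):
    """Codewords in lexicographic order with nondecreasing lengths."""
    words = []
    value, previous = 0, 0
    for length in sorted(lengths):
        # Moving to a longer length appends zeros in base t.
        value *= t ** (length - previous)
        words.append(to_base_t(value, length, t))
        value += 1
        previous = length
    return words


original = ["00", "010", "011", "012", "02", "1", "20", "21", "220", "221", "222"]
print(canonical_code([len(w) for w in original], 3))
```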

1.3. A problem on bounded degree sequences. The number $a_i$ of code words of length $i$,
or leaves at level $i$, is of course bounded above by $t^i$. But there is no absolute bound on
the ratios $a_i / a_{i-1}$. Let us study another sequence instead, namely $b_1 = 1$, $b_i = t b_{i-1} - a_{i-1}$, see Komlós, W. Moser
and Nemetz [20] and Flajolet and Prodinger [11]. The problems of counting these sequences
are equivalent to the earlier counting problem. For these sequences the ratios $b_i / b_{i-1}$ are bounded,
which is why one may call these sequences "bounded degree sequences". Flajolet and Prodinger
[11] used this definition when they counted level number sequences of trees.

Definition 4 (Bounded degree). Let $t \ge 2$, $r \ge 2$ be integers. Let $f_t(r)$ denote the number of
sequences

$$(b_1, \dots, b_l), \quad l \ge 1, \quad b_1 = 1, \quad 1 \le b_i \le t\, b_{i-1} \ (i = 2, \dots, l), \quad \sum_{i=1}^{l} b_i = \frac{r-1}{t-1}.$$

For convenience we will later also use $g_t(n) = f_t(1 + n(t-1))$. (Here, one can think of $n = \frac{r-1}{t-1}$.)

A bijection between the last two definitions is as follows: Given a canonical tree, we set $b_i$
to be the number of inner vertices at height $i - 1$. Observe that the $b_i$ inner vertices guarantee
that there are at most $t b_i$ vertices of any type (inner vertices or leaves) on the next level.
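Definition 4 lends itself to a direct count by dynamic programming over the pair (remaining sum, previous term). The following is a minimal sketch under our own naming (`g`, `f`, `extend` are hypothetical helpers, not an algorithm from the cited papers); the value 1 for n = 0 is a convention for the trivial tree with a single leaf.

```python
from functools import lru_cache


def g(t, n):
    """Number of sequences b_1 = 1, 1 <= b_i <= t*b_{i-1}, with sum n."""

    @lru_cache(maxsize=None)
    def extend(remaining, previous):
        # Number of ways to continue a sequence whose last term is
        # `previous` so that the remaining terms add up to `remaining`.
        if remaining == 0:
            return 1
        top = min(remaining, t * previous)
        return sum(extend(remaining - b, b) for b in range(1, top + 1))

    return 1 if n == 0 else extend(n - 1, 1)


def f(t, r):
    """f_t(r); zero unless r = 1 + n(t - 1) for some n >= 0."""
    if (r - 1) % (t - 1) != 0:
        return 0
    return g(t, (r - 1) // (t - 1))


print([f(2, r) for r in range(1, 11)])
print([g(3, n) for n in range(7)])
```

The recursion depth is bounded by n, so this sketch is only meant for moderate arguments.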

A very similar definition is due to Even and Lempel [9].
Definition 5 (Proper words). Let $t \ge 2$ and $n \ge 1$ be integers. A word $u_1 \dots u_n$ over
the alphabet {0, 1} is said to be a proper word, if it can be written in the form $u_1 \dots u_n =
0^{c_0} 1 0^{c_1} 1 \dots 0^{c_{l-1}} 1 0^{c_l}$ such that $c_0 = 0$ and $0 \le c_{i+1} \le t c_i + t - 1$ holds for all $0 \le i \le l - 1$.

Note that the sequence $c_i$ describes the lengths of the runs of consecutive zeros. We note
also that from the representation as a word of length $n$, we immediately get $\sum_{i=1}^{l} c_i = n - l$.

To see that Definition 5 is equivalent to Definition 4, we simply note that the relations
$b_{i+1} = c_i + 1$ and $n = \frac{r-t}{t-1}$ induce a bijection between the objects counted in the two definitions.
Even and Lempel [9] also give a combinatorial interpretation of this bijection (for t = 2, but
the generalisation is straightforward): essentially, for each 1 in a proper word, they replace a
leaf of maximum height by an inner vertex with t leaves as successors; for each 0, they replace
a leaf of second-most height by an inner vertex with t leaves as successors.
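The correspondence between bounded degree sequences and proper words can be made concrete. The sketch below assumes the reading of Definition 5 in which a proper word has the form $0^{c_0} 1 0^{c_1} 1 \dots 1 0^{c_l}$ with $c_0 = 0$, together with the relation $b_{i+1} = c_i + 1$; the function names are our own.

```python
def proper_word(b):
    """Word 0^{c_0} 1 0^{c_1} 1 ... 1 0^{c_l} with c_i = b_{i+1} - 1."""
    runs = [term - 1 for term in b]
    return "1".join("0" * c for c in runs)


def bounded_sequence(word):
    """Inverse map: run lengths of zeros between the ones, each plus one."""
    return [len(run) + 1 for run in word.split("1")]


def is_proper(word, t):
    """Check c_0 = 0 and c_{i+1} <= t*c_i + t - 1 for the zero runs."""
    runs = [len(run) for run in word.split("1")]
    if runs[0] != 0:
        return False
    return all(nxt <= t * cur + t - 1 for cur, nxt in zip(runs, runs[1:]))


for b in ([1, 1, 1, 1], [1, 1, 2], [1, 2, 1]):
    print(b, proper_word(b))
```

For t = 2 and r = 5 this reproduces the three proper words of length 3 listed in the example of Section 1.4.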

We briefly mention some further approaches which investigate equivalent sequences. Working
on a different problem, Minc [22] reduced it to the study of a binary bounded degree sequence,
Definition 4 above. Let A be a free commutative entropic cyclic groupoid. The number of
elements of A of a given degree turns out to satisfy the relation above. (For a full description
we must refer to [22].) The condition in Definition 1 looks like a special partition function.
Andrews [2] expanded on Minc's work, in particular studying generating functions.

A further problem, on lambda algebras $\Lambda_p$, has been related to these sequences, see Tangora
[31].


1.4. An example. As an example for these various definitions, let us compute $f_2(5) = 3$ in
the different forms. Using Definition 1,

$$1 = \frac12 + \frac14 + \frac18 + \frac1{16} + \frac1{16} = \frac12 + \frac18 + \frac18 + \frac18 + \frac18 = \frac14 + \frac14 + \frac14 + \frac18 + \frac18$$

is a complete list of all solutions.

Counting Huffman sequences (Definition 2) we count $(a_0, a_1, \dots)$ where $a_i$ is the number of
occurrences of the fraction $1/t^i$. Here with $t = 2$ these sequences are:

(0,1,1,1,2), (0,1,0,4), (0,0,3,2).
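Such lists can be produced by walking down the levels of the tree: on each level one decides how many of the available vertices become leaves, and every remaining (inner) vertex contributes t vertices to the next level. The following sketch is our own illustration of Definition 2 (the name `huffman_sequences` is hypothetical) and is not one of the listing algorithms from the literature.

```python
def huffman_sequences(t, r):
    """All (a_0, ..., a_l), a_l > 0, with sum a_i = r and sum a_i / t**i = 1."""
    results = []

    def descend(prefix, slots, remaining):
        # slots: vertices on the current level; remaining: leaves still to place.
        for leaves in range(min(slots, remaining) + 1):
            inner = slots - leaves
            rest = remaining - leaves
            if inner == 0:
                if rest == 0:
                    results.append(tuple(prefix + [leaves]))
            elif rest >= t * inner:
                # Every inner vertex needs at least t leaves below it.
                descend(prefix + [leaves], t * inner, rest)

    descend([], 1, r)
    return results


print(sorted(huffman_sequences(2, 5)))
```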

Let us explicitly write down the compact Huffman codes.

$C_1$ = {0, 10, 110, 1110, 1111}, $C_2$ = {0, 100, 101, 110, 111}, $C_3$ = {00, 01, 10, 110, 111}.

The bounded degree sequences counted in Definition 4 are (1, 1, 1, 1), (1, 1, 2), (1, 2, 1). The
proper words in Definition 5 are (111), (110), (101). The canonical trees (Definition 3) are the
following:
1.5. An observation. When evaluating $f_t(r)$ according to Definition 2 of Huffman
sequences, it suffices to investigate in which way a solution counted by $f_t(r - t + 1)$ can be split.
Let $S_t(r)$ denote the set of all sequences counted by $f_t(r)$. Generally, $(a_0, a_1, \dots, a_i, a_{i+1}, \dots, a_l)$
can be split into $(a_0, a_1, \dots, a_i - 1, a_{i+1} + t, \dots, a_l)$, whenever $a_i > 0$. Starting from a complete
set of solutions, that is $S_t(r - t + 1)$, one only needs to branch each sequence at the last two
positions, in order to compile a complete set of solutions, $S_t(r)$. The reason for this is that all
elements of $S_t(r)$ obtained from branching at any of the earlier positions will be obtained from
another member of $S_t(r - t + 1)$ by branching at the last two positions. Before we prove this
in general, let us look at an example. Let us determine $S_2(6)$, starting from the three elements
of $S_2(5)$ = {01112, 0104, 0032}:

0111|2 → 0111|12, 011|12 → 011|04, 010|4 → 010|32, 003|2 → 003|12, 00|32 → 00|24.

There is no need to consider

0|1112 → 0|0312, or 01|112 → 01|032, or 0|104 → 0|024,

as these are obtained otherwise.

To see this generally, let us consider the step from $f_t(r - t + 1)$ to $f_t(r)$: If $(a_0, a_1, a_2, a_3, \dots, a_l) \in
S_t(r - t + 1)$, i.e. $\sum_{i=0}^{l} a_i = r - t + 1$, with $a_i > 0$, we need to check if $(a_0, a_1, \dots, a_i - 1, a_{i+1} +
t, a_{i+2}, \dots, a_l) \in S_t(r)$ will be reached by branching an appropriate element of $S_t(r - t + 1)$ in
any of the last two positions only.

Note that $(a_0, a_1, \dots, a_i - 1, a_{i+1} + t, a_{i+2}, \dots, a_{l-1} + 1, a_l - t) \in S_t(r - t + 1)$. Hence one reaches
$(a_0, a_1, \dots, a_i - 1, a_{i+1} + t, a_{i+2}, \dots, a_l) \in S_t(r)$ by branching in the last two positions only.
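The observation translates into a simple generation procedure. The sketch below is our own illustration of branching at the last two positions (the names `branch_last_two` and `all_sequences` are hypothetical); it is not the algorithm of any particular reference.

```python
def branch_last_two(seq, t):
    """Sequences obtained from seq by splitting one leaf on the last
    level or, if there is one, on the level above it."""
    *head, last = seq
    out = [tuple(head) + (last - 1, t)]      # split a leaf on the last level
    if head and head[-1] > 0:                # split a leaf one level higher
        out.append(tuple(head[:-1]) + (head[-1] - 1, last + t))
    return out


def all_sequences(t, r):
    """S_t(r), built from S_t(1) = {(1,)} by repeated branching."""
    if (r - 1) % (t - 1) != 0:
        return set()
    current = {(1,)}
    for _ in range((r - 1) // (t - 1)):
        current = {new for seq in current for new in branch_last_two(seq, t)}
    return current


print(sorted(all_sequences(2, 6)))
```

For t = 2 and r = 6 this yields exactly the five sequences obtained in the example above.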

We may also observe that this gives a trivial upper bound of $f_t(r) \le 2^{\frac{r-1}{t-1}}$.

Using the above observation of branching at two positions only, Narimani and Khosravifard
[23] describe a recursive algorithm to create all codes counted by $f_t(r)$.

The first terms of the sequence $f_2(r)$ are:

t = 2 : 1, 1, 1, 2, 3, 5, 9, 16, 28, 50, 89, 159, 285, 510, 914, 1639, . . .

The values of $f_3(r)$ are zero whenever r is even. The nontrivial part of the sequence for odd r,
that is $g_3(n)$, starts with

t = 3 : 1, 1, 1, 2, 4, 7, 13, 25, 48, 92, 176, . . .

(see also [28]). For general t, the sequence is only non-zero for $r = 1 + (t - 1)n$. For convenience
one examines $g_t(n) = f_t(1 + n(t - 1))$ instead, see Definition 4. For reference purposes we list
the first values of the sequences $g_t(n)$ in Table 1. In these tables one can easily notice the
observation above, $g_t(n) = f_t(r) \le 2^{\frac{r-1}{t-1}} = 2^n$.

The sequences $g_2(n)$, $g_3(n)$ and $g_4(n)$ have been included into the OEIS (sequences A002572,
A176485 and A176503). (The latter two sequences only after the appearance of the Paschke et
al. paper [25].)

1.6. The growth of $f_t(r)$. As far as we are aware, Bende (1967) [1] and Norwood (1967)
[24] were the first to examine the sequence $f_2(r)$, and they observed the connection to coding
theory and trees. (Minc's 1958 paper [22] was, of course, earlier but had less interest in the
sequence itself.) Bende asked about the asymptotic growth. Erdős in his review of Bende's
paper (Mathematical Reviews) also wrote it is "desirable" to know the asymptotic.
 t      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16     17     18     19     20
 2      1      1      1      2      3      5      9     16     28     50     89    159    285    510    914   1639   2938   5269   9451  16952
 3      1      1      1      2      4      7     13     25     48     92    176    338    649   1246   2392   4594   8823  16945  32545  62509
 4      1      1      1      2      4      8     15     29     57    112    220    432    848   1666   3273   6430  12632  24816  48754  95783
 5      1      1      1      2      4      8     16     31     61    121    240    476    944   1872   3712   7362  14601  28958  57432 113904
 6      1      1      1      2      4      8     16     32     63    125    249    496    988   1968   3920   7808  15552  30978  61705 122910
 7      1      1      1      2      4      8     16     32     64    127    253    505   1008   2012   4016   8016  16000  31936  63744 127234
 8      1      1      1      2      4      8     16     32     64    128    255    509   1017   2032   4060   8112  16208  32384  64704 129280
 9      1      1      1      2      4      8     16     32     64    128    256    511   1021   2041   4080   8156  16304  32592  65152 130240
10      1      1      1      2      4      8     16     32     64    128    256    512   1023   2045   4089   8176  16348  32688  65360 130688

Table 1. Values of $g_t(n)$ for $2 \le t \le 10$ and $1 \le n \le 20$.



The early 1970s saw a considerable number of contributions to the problem, such as Boyd
[5], Even and Lempel [9], and Gilbert [13].

A trivial upper bound for the number of rooted canonical trees on $|V|$ vertices is $2^{\binom{|V|}{2}}$. A
much more precise bound is the number of all trees. The number of binary trees on $|V|$ vertices
is determined by the Catalan numbers $\frac{1}{n+1}\binom{2n}{n} = O(4^n n^{-3/2})$ and the number of non-isomorphic
trees is asymptotically $\sim C_2 C_1^n n^{-5/2}$, where $C_1 = 2.955\dots$ and $C_2 = 0.5349\dots$, see Otter [26].

A trivial lower bound comes from observing that Definition 4 shows that $f_2(r) \ge F_r$, where
$F_r$ is the number of ways of partitioning $r - 1$ into ones and twos. It is known that this is the
$r$-th Fibonacci number so that $f_2(r) \ge 0.4472 \times 1.618^r$ (for sufficiently large $r$). Similarly,
a lower bound on $f_t(r)$ can be obtained by partitioning $r - 1$ into 1's, 2's, ... and t's. By
means of the generating series $\frac{1}{1 - z - z^2 - \dots - z^t}$ and determining a real root of the equation
$1 - z - z^2 - \dots - z^t = 0$ near 0.5, the corresponding generalised Fibonacci number $F_{t,r}$ can be
shown to be about $c_t \rho_t^r$, where $\rho_t \approx 2 - \frac{1}{2^t - 1}$, and $c_t$ is a positive constant. In the next section
we will refine an analysis of this type considerably.
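The growth rate of the generalised Fibonacci numbers can be located numerically: it is the reciprocal of the root of $1 - z - z^2 - \dots - z^t$ in the interval (0, 1). The sketch below uses plain bisection; the function name `growth_rate` and the tolerance are our own choices.

```python
def growth_rate(t, tol=1e-14):
    """Reciprocal of the root in (0, 1) of 1 - z - z**2 - ... - z**t."""

    def p(z):
        return 1 - sum(z ** k for k in range(1, t + 1))

    lo, hi = 0.0, 1.0      # p(0) = 1 > 0, p(1) = 1 - t < 0, p is decreasing
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 2 / (lo + hi)


for t in (2, 3, 4, 10):
    print(t, growth_rate(t))
```

For t = 2 this is the golden ratio, and the values increase towards 2 as t grows.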

Boyd (1975) [5], Komlós, W. Moser and Nemetz (1984) [20], and Flajolet and Prodinger (1987)
[11], all independently, gave an asymptotic:

$$f_2(r) \sim R\rho^r,$$

where $R \approx 0.14185$, $\rho \approx 1.7941471$. Boyd and Flajolet and Prodinger additionally gave an
error term: $f_2(r) = R\rho^r + O(\tilde\rho^{\,r})$, where Boyd proves $\tilde\rho = 1.55$, and Flajolet and Prodinger
proved that this even holds for p = y. Boyd, and Komlos, W. Moser and Nemetz also study 
the case of more general t. As noted before: as $f_t(r)$ is positive only for $r = 1 + n(t - 1)$, one
examines $g_t(n) = f_t(1 + n(t - 1))$ instead.

In particular Komlós, Moser and Nemetz observed that $g_t(n) \sim K_t \rho_t^n$ with $\rho_t \to 2$, as
t increases. Flajolet and Prodinger [11] also refer to other areas, where the sequence $f_2(r)$
naturally occurs.

Building upon [11], but not being aware of [5] nor [20], Tangora (1991) [31] generalised the
results to prime values of t.

Another string of references follows from Gilbert's experimental observation that $f_2(r) \sim
0.148\,(1.791)^r$, see [13]. The observation was based on the values for $r \le 30$, and is relatively
close to the true asymptotic $f_2(r) \sim 0.1418\dots(1.7941\dots)^r$. However, these approximations
have been referred to in the more recent coding literature, see for example [27], [29], [TJ, [231 . 
UE mid [H]. 
More recently Burkert (2010) [7] and Paschke, Burkert, Fehribach (2011) [25] studied $f_2(r)$
and $f_t(r)$ respectively, unfortunately with inferior results and unfortunately being unaware of
the earlier work.

In the results that we describe in detail in the next section, we state a rather precise
asymptotic formula, with two main terms, and an error term, which is exponentially smaller. As an
example, one finds an approximation

$$f_2(n + 1) \approx R\rho^{n+1} + R_2\rho_2^{n+1},$$

with

$\rho$ = 1.794147187541686, $\rho_2$ = 1.279549134726681,

$R$ = 0.1418532020854094, $R_2$ = 0.0612410410312.

Let us evaluate $f_2(50) \approx 699427308155.394\dots$. While the error analysis of Theorem 7
(below) gives an error of $|f_2(50) - (R\rho^{50} + R_2\rho_2^{50})| \le 36.6 \cdot 1.123^{50} < 12092$, the actual absolute
error is much smaller and, in this case, the above approximation predicts the correct value of
$f_2(50) = 699427308155$.
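The comparison between the two-term approximation and the exact value can be repeated with a few lines of code. This is only a sketch: the exact count uses a straightforward dynamic programme over bounded degree sequences (Definition 4), the helper names `f2_exact` and `f2_approx` are hypothetical, and the constants are the rounded values quoted above, so the floating point result is reliable only to a few decimal places.

```python
from functools import lru_cache

RHO, RHO2 = 1.794147187541686, 1.279549134726681
R, R2 = 0.1418532020854094, 0.0612410410312


def f2_exact(r):
    """f_2(r), counted via binary bounded degree sequences with sum r - 1."""

    @lru_cache(maxsize=None)
    def extend(remaining, previous):
        if remaining == 0:
            return 1
        top = min(remaining, 2 * previous)
        return sum(extend(remaining - b, b) for b in range(1, top + 1))

    return 1 if r == 1 else extend(r - 2, 1)


def f2_approx(r):
    """Two-term approximation R*rho**r + R_2*rho_2**r."""
    return R * RHO ** r + R2 * RHO2 ** r


print(f2_exact(50), f2_approx(50))
```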

1.7. A note on algorithms and complexity. The question of the complexity of the
evaluation of $f_2(r)$ is raised in Even and Lempel [9]. They give an algorithm to determine $f_2(r)$ in
$O(r^3)$ additions. This appears to be the only algorithm with an analysis of its complexity. They
also state another algorithm to give a complete list of the $f_2(r)$ elements.

Huffman, Johnson and Wilson [15] describe another algorithm to give a complete list.

A tree based algorithm for generating the binary compact codes is described in [TS]. Narimani 
and Khosravifard [23] describe a recursive algorithm to create all t-ary codes of length r from
those of length r - t + 1.

2. Results 

In the following, a tree will always be a t-ary rooted canonical tree. The set of t-ary canonical
trees is denoted by $\mathcal{T}$. The number of inner vertices (non-leaves) of a tree $T$ is denoted by $n(T)$.
Setting $c_n := g_t(n)$ to be the number of trees $T \in \mathcal{T}$ with $n$ inner vertices, we are interested in
the generating function

$$F(q) = \sum_{n \ge 0} c_n q^n = \sum_{T \in \mathcal{T}} q^{n(T)}.$$

This generating function can be computed explicitly:



The oversights some decades ago can be easily explained due to the fact that the results were discovered 
independently by people with interests in number theory, coding theory or graph theory. Boyd's paper [5] has a 
number theoretic title, the Komlos et al. paper [20] a coding title and appeared in a less accessible journal. Using 
standard tools such as MathSciNet, Zentralblatt, Google Scholar, Online Encyclopedia of Integer Sequences 
(OEIS) we found a considerable corpus of literature referring to the result that $f_t(r) \sim K_t \cdot \rho_t^r$.
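Theorem 6. Setting $[k] := 1 + t + t^2 + \dots + t^{k-1}$, we have

$$F(q) = \frac{\displaystyle\sum_{j=0}^{\infty} (-1)^j q^{[j]} \prod_{i=1}^{j} \frac{q^{[i]}}{1 - q^{[i]}}}{\displaystyle\sum_{j=0}^{\infty} (-1)^j \prod_{i=1}^{j} \frac{q^{[i]}}{1 - q^{[i]}}}.$$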
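As a plausibility check, the quotient in Theorem 6 can be expanded as a power series with exact integer arithmetic and compared with the first terms listed in Section 1.5. The sketch below assumes the form of the theorem in which numerator and denominator are alternating sums over products of $q^{[i]}/(1 - q^{[i]})$, the numerator carrying an extra factor $q^{[j]}$; all function names are our own.

```python
def bracket(k, t):
    """[k] = 1 + t + ... + t**(k-1), with [0] = 0."""
    return sum(t ** i for i in range(k))


def series_product(a, b, n):
    """Product of two coefficient lists, truncated after q**n."""
    out = [0] * (n + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(n + 1 - i):
                out[i + j] += ai * b[j]
    return out


def coefficients_from_theorem(t, n):
    """c_0, ..., c_n from the power series expansion of the quotient."""
    numer = [0] * (n + 1)
    denom = [0] * (n + 1)
    prod = [1] + [0] * n          # empty product for j = 0
    sign, j = 1, 0
    while True:
        shift = bracket(j, t)
        for i, p in enumerate(prod):
            denom[i] += sign * p
            if i + shift <= n:
                numer[i + shift] += sign * p
        j += 1
        step = bracket(j, t)
        if step > n:              # further factors vanish modulo q**(n+1)
            break
        # q**step / (1 - q**step) = q**step + q**(2*step) + ...
        factor = [1 if m > 0 and m % step == 0 else 0 for m in range(n + 1)]
        prod = series_product(prod, factor, n)
        sign = -sign
    coeffs = []
    for m in range(n + 1):        # series division, using denom[0] == 1
        coeffs.append(numer[m] - sum(denom[i] * coeffs[m - i] for i in range(1, m + 1)))
    return coeffs


print(coefficients_from_theorem(2, 9))
print(coefficients_from_theorem(3, 8))
```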



Using the generating function, we can give a very precise asymptotic expression for c n . In 
view of the numerous asymptotic approximations we would like to point out that this is the 
first result containing two main terms and an explicit error term. 

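Theorem 7. For $t \ge 2$, the following holds:

$$c_n = g_t(n) = R\rho^{n+1} + R_2\rho_2^{n+1} + R_3 r_3^{n+1}\varepsilon_1(t, n). \tag{2.1}$$

Here $\rho > \rho_2 > r_3$ and $R$, $R_2$, $R_3$ are positive real constants to be specified below, and depending
on $t$. Here and below, $\varepsilon_j(\dots)$, $j = 1, \dots$, denote real functions with $|\varepsilon_j(\dots)| \le 1$ for all valid
values of the respectively indicated parameters.
For $t \ge 16$ we have

$$\rho = 2 - \frac{1}{2^{t+1}} - \frac{t+3}{2^{2t+3}} - \frac{3t^2 + 19t + 24}{2^{3t+6}} + \frac{0.28\,t^3}{2^{4t}}\,\varepsilon_2(t), \tag{2.2}$$

$$\rho_2 = 1 + \frac{\log 2}{t} - \frac{\log 2 - \log^2 2}{2t^2} + \frac{4\log^3 2 + 3\log^2 2 + 6\log 2}{24t^3} + \frac{2\log^4 2 + 54\log^3 2 - 27\log^2 2 - 6\log 2}{48t^4} + \frac{0.26}{t^5}\,\varepsilon_3(t), \tag{2.3}$$

$$r_3 = 1 + \frac{\log 2}{t} - \frac{\log 2 - \log^2 2}{2t^2}, \tag{2.4}$$

$$R = \frac{1}{8} + \frac{t-2}{2^{t+5}} + \frac{2t^2 + 3t - 5}{2^{2t+7}} + \frac{9t^3 + 45t^2 + 20t - 68}{2^{3t+10}} + \frac{t^4}{50 \cdot 2^{4t}}\,\varepsilon_4(t), \tag{2.5}$$

$$R_2 = \frac{1}{4t} - \frac{4\log 2 + 1}{8t^2} + \frac{0.77}{t^3}\,\varepsilon_5(t), \tag{2.6}$$

$$R_3 = 5t^4. \tag{2.7}$$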



For $3 \le t \le 15$, (2.1) holds with (2.2), (2.5), (2.6) and the values for $\rho_2$, $r_3$ and $R_3$ given in
Table 2.

For $t = 2$, (2.1) holds with (2.6) and the values for $\rho$, $\rho_2$, $r_3$, $R$ and $R_3$ given in Table 2.

For simplicity the functions $\varepsilon_j$ can be thought of as O(1) terms. Some of our proofs indeed
depend on explicit values of the error bounds. For this reason we had to compute absolute
O-constants in any case, and decided to include these in the statement of the theorem.

The asymptotic result focusses on the first and the second exponential terms $\rho^{n+1}$ and $\rho_2^{n+1}$
and no effort has been made to improve the error term $r_3^{n+1}$: note that for large t it is not much
smaller than the second order term $\rho_2^{n+1}$. For Table 2 the values $r_3$ have been improved by a
computer calculation in comparison with Equation (2.4), also leading to a stronger value of the
constant $R_3$ in comparison with (2.7). In principle, this type of improvement is possible for
any fixed $t \ge 16$ as well.
 t   $\rho$               $\rho_2$             $r_3$   $R$                   $R_2$                  $R_3$
13   1.999938935019296*   1.052819586914068*   1.044   0.1250420050254539*   0.01673722535920120*   80.6
14   1.999969474502513*   1.049072853620226*   1.042   0.1250229006766309*   0.01568876914448585*   43.3
15   1.999984739115025*   1.045822904924682*   1.040   0.1250124013324635*   0.01476426249364319*   39

Table 2. Values for small values of t. Starred (*) entries correspond to values
satisfying the asymptotic estimates of Theorem 7. The values could be given
with much higher precision; there is some uncertainty about the last digit.



The asymptotic expansions of $\rho$, $\rho_2$, $R$ and $R_2$ can always be refined by further iterating
the fixed point equations in the proof of Proposition 10. So for fixed $k$, we could refine the
estimates for $\rho$ and $R$ to a precision of $t^k 2^{-tk}$ and the estimates for $\rho_2$ and $R_2$ to a precision of
$t^{-k}$.



3. Generating Function 
This section is devoted to the proof of Theorem 6.

Proof of Theorem 6. In the proof of the theorem, we will actually consider more refined
statistics in order to derive a functional equation for a more general generating function.

The height of a vertex in a rooted tree is defined to be its distance from the root. So the
root has height 0. The height height(T) of a tree T is defined to be the maximal height of its
vertices.

For a rooted tree $T$, we set $m(T)$ to be the number of leaves of maximum height of $T$.
We will derive a functional equation for the generating function

$$G(q, u) = \sum_{T \in \mathcal{T}} q^{n(T)} u^{m(T)},$$

i.e., $u$ counts the number of leaves of maximal height and $q$ counts the number of inner vertices.
By definition, we have $F(q) = G(q, 1)$.

To derive the functional equation for $G(q, u)$, we partition $\mathcal{T}$ with respect to the height and
consider

$$G_k(q, u) = \sum_{\substack{T \in \mathcal{T} \\ \mathrm{height}(T) = k}} q^{n(T)} u^{m(T)}.$$

Obviously, we have

$$G(q, u) = \sum_{k \ge 0} G_k(q, u).$$
A tree $T$ of height $k$ corresponds to exactly $m(T)$ trees $T_j$, $j \in \{1, \dots, m(T)\}$, of height
$k + 1$: $T_j$ arises from $T$ by replacing $j$ of the $m(T)$ leaves of maximum height by vertices with $t$
attached leaves. On the other hand, all trees $T'$ of height $k + 1$ are uniquely described by this
process.

Thus we have

$$G_{k+1}(q, u) = \sum_{\substack{T \in \mathcal{T} \\ \mathrm{height}(T) = k}} \sum_{j=1}^{m(T)} q^{n(T) + j} u^{jt}
= \sum_{\substack{T \in \mathcal{T} \\ \mathrm{height}(T) = k}} q^{n(T)} \frac{q u^t - (q u^t)^{m(T)+1}}{1 - q u^t}
= \frac{q u^t}{1 - q u^t} \bigl( G_k(q, 1) - G_k(q, q u^t) \bigr).$$

We have $G_0(q, u) = u$, so summing over all $k \ge 0$ yields

$$G(q, u) - u = \frac{q u^t}{1 - q u^t} \bigl( G(q, 1) - G(q, q u^t) \bigr). \tag{3.2}$$


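The generating function $G(q, u)$ is certainly convergent for $|u| \le 1$ and $|q| < 1/2$, as can be
seen from (3.1).

We now keep $q$ with $|q| < 1/2$ fixed and consider everything as a function of $u$ with $|u| \le 1$.
We use the abbreviations $h(u) = q u^t / (1 - q u^t)$ and $g(u) = G(q, u)$. We rewrite the functional
equation (3.2) as

$$g(u) = u + h(u) g(1) - h(u) g(q u^t).$$

By iteration, we obtain

$$g(u) = a_k(u) + b_k(u) g(1) + c_k(u)\, g\bigl(q^{[k+1]} u^{t^{k+1}}\bigr),$$

$$a_k(u) = \sum_{j=0}^{k} (-1)^j q^{[j]} u^{t^j} \prod_{i=0}^{j-1} h\bigl(q^{[i]} u^{t^i}\bigr),$$

$$b_k(u) = \sum_{j=0}^{k} (-1)^j \prod_{i=0}^{j} h\bigl(q^{[i]} u^{t^i}\bigr),$$

$$c_k(u) = (-1)^{k+1} \prod_{i=0}^{k} h\bigl(q^{[i]} u^{t^i}\bigr)$$

for $k \ge 0$. As $|h(u)| \le \frac{|q|}{1 - |q|} < 1$ holds for all $|u| \le 1$, the limits

$$a(u) = \sum_{j=0}^{\infty} (-1)^j q^{[j]} u^{t^j} \prod_{i=0}^{j-1} h\bigl(q^{[i]} u^{t^i}\bigr),$$

$$b(u) = \sum_{j=0}^{\infty} (-1)^j \prod_{i=0}^{j} h\bigl(q^{[i]} u^{t^i}\bigr)$$

exist and we have $\lim_{k \to \infty} c_k(u)\, g\bigl(q^{[k+1]} u^{t^{k+1}}\bigr) = 0$.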
Thus we obtained

$$g(u) = a(u) + b(u) g(1).$$

Setting $u = 1$ yields

$$F(q) = G(q, 1) = g(1) = \frac{a(1)}{1 - b(1)}.$$

□

4. Asymptotics

We will use the following notations in order to work with the generating function $F$:

$$f_j(q) = \frac{q^{[j]}}{1 - q^{[j]}},$$

$$N_K(q) = \sum_{0 \le k < K} (-1)^k q^{[k]} \prod_{j=1}^{k} f_j(q), \qquad D_K(q) = \sum_{0 \le k < K} (-1)^k \prod_{j=1}^{k} f_j(q),$$

$$N(q) = \sum_{0 \le k} (-1)^k q^{[k]} \prod_{j=1}^{k} f_j(q), \qquad D(q) = \sum_{0 \le k} (-1)^k \prod_{j=1}^{k} f_j(q).$$

The quantities have been defined such that $F(q) = N(q) / D(q)$.

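We intend to work with the finite sums $D_K$ and $N_K$ for fixed values of $K$, so we need upper
bounds for the approximation errors.

Lemma 8. Let $K \ge 0$ and $|q|^{[K+1]} < 1/2$. Then

$$|N(q) - N_K(q)| \le \frac{1 - |q|^{[K+1]}}{1 - 2|q|^{[K+1]}}\, |q|^{[K]} \prod_{j=1}^{K} \frac{|q|^{[j]}}{1 - |q|^{[j]}}, \tag{4.1a}$$

$$|D(q) - D_K(q)| \le \frac{1 - |q|^{[K+1]}}{1 - 2|q|^{[K+1]}} \prod_{j=1}^{K} \frac{|q|^{[j]}}{1 - |q|^{[j]}}. \tag{4.1b}$$

These bounds are decreasing in $t$ and increasing in $|q|$.

Proof. As $|f_j(q)| \le f_j(|q|)$ and $f_j(|q|)$ is decreasing in $j$, we have

$$|D(q) - D_K(q)| \le \sum_{k=K}^{\infty} \prod_{j=1}^{K} f_j(|q|) \prod_{j=K+1}^{k} f_j(|q|)
\le \prod_{j=1}^{K} f_j(|q|) \sum_{k=K}^{\infty} f_{K+1}(|q|)^{k-K}
= \frac{\prod_{j=1}^{K} f_j(|q|)}{1 - f_{K+1}(|q|)},$$

which, upon inserting the definition of $f_j$, yields (4.1b). The approximation bound (4.1a) for
the numerator follows along the same lines; we get an additional factor $|q|^{[K]}$. □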

We will also need estimates for the derivative D'(q): 
Lemma 9. Let $t \ge 30$ and $q \in \mathbb{C}$ with $1/2 \le |q| \le 1/r_3$, where $r_3$ is defined in (2.4).
Then

\D\q)-D'M\<^. 

Proof. Let $q = 1/z$ with $r_3 \le |z| \le 2$. Then $f_j(q) = f_j(1/z) = 1/(z^{[j]} - 1)$ and $|f_j(1/z)| = 1/|z^{[j]} - 1| \le
1/(r_3^{[j]} - 1)$. By estimating the relevant power series, we get

1 



r 3 -l > 



rf ] ~ 1 



.[3] 
3 



2f 
exp 

1 > 2', 



1 + t) log 1 + 



log 2 log 2 -log 2 2 



t 



2t 2 



1 > 1, 



1 _ 2* 2 +*/ 2 



(4.2a) 
(4.2b) 



We have 

\d\i/z) - d' a (i/z)\ < \z\J2TI /i(VM) ( E r 



fc=4 i=i 



(VM) W 



fc=4 

00 

I 

< 



-T — 



2-l+t(fc-l) /2+(fc-3)t 2 
1 



i=2 



fc=4 



2(fc-3)t 2 +t(fc-l)/2-4 



2 2* 2 ( fc -3) 

fc=4 



< 



2* 2 ' 



□ 



The exponential growth of the coefficients $c_n$ of $F(q)$ is directly related to the dominating
pole $1/\rho$ of $F(q)$. So we now investigate the location of the poles of $F(q)$.

Proposition 10. Let $t \ge 2$. Then there are exactly two poles $1/\rho$ and $1/\rho_2$ of $F(q)$ with
$|q| \le 1/r_3$, where $r_3$ has been defined in (2.4) (or Table 2 for $t \in \{2, 3\}$).

Both $1/\rho$ and $1/\rho_2$ are simple poles of $F(q)$. The dominant pole $1/\rho$ of $F(q)$ is asymptotically
given by (2.2) (or Table 2 for $t = 2$).

The residue of $F(q)$ at $1/\rho$ is $-R$ where $R$ is asymptotically given by (2.5) (or Table 2 for
$t = 2$).

The pole $1/\rho_2$ is given by (2.3) (or Table 2 for $2 \le t \le 15$), the residue of $F(q)$ at $1/\rho_2$ is
$-R_2$, where $R_2$ is given in (2.6).
Finally, we have

$$|F(q)| \le 5t^4 \tag{4.3}$$

for all $q$ with $|q| = 1/r_3$.
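For $t = 2$ the dominant pole and its residue can be approximated numerically from truncated versions of $N$ and $D$. The sketch below is only an illustration with our own helper names; it assumes $f_j(q) = q^{[j]}/(1 - q^{[j]})$ and the alternating sums defining $N_K$ and $D_K$, locates the zero of $D$ in the interval [0.5, 0.6] by bisection, and estimates $D'$ by a central difference.

```python
def bracket(k, t):
    """[k] = 1 + t + ... + t**(k-1), with [0] = 0."""
    return sum(t ** i for i in range(k))


def truncated_N_D(q, t, terms=8):
    """Approximations N_K(q), D_K(q) with K = terms."""
    n_sum = d_sum = 0.0
    prod, sign = 1.0, 1.0
    for k in range(terms):
        power = q ** bracket(k, t)
        if k > 0:
            prod *= power / (1.0 - power)    # multiply by f_k(q)
        d_sum += sign * prod
        n_sum += sign * prod * power
        sign = -sign
    return n_sum, d_sum


def dominant_pole(t=2, lo=0.5, hi=0.6):
    """Zero of D in [lo, hi]; for t = 2, D is positive at 0.5 and negative at 0.6."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if truncated_N_D(mid, t)[1] > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def residue_constant(t=2, h=1e-6):
    """R = -N(q0) / D'(q0), with D' estimated by a central difference."""
    q0 = dominant_pole(t)
    d_prime = (truncated_N_D(q0 + h, t)[1] - truncated_N_D(q0 - h, t)[1]) / (2 * h)
    return -truncated_N_D(q0, t)[0] / d_prime


print(1 / dominant_pole(2), residue_constant(2))
```

The two printed numbers can be compared with the values of $\rho$ and $R$ quoted in Section 1.6.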



The proof of Proposition 10 relies on rewriting the equation $D(q) = 0$ into two fixed point
equations, one for each of the two poles. Inserting preliminary bounds into these fixed point
equations improves these bounds. This method is known as bootstrapping. The first pole
is an attracting fixed point of the first fixed point formulation, whereas the second pole is a
repellent fixed point of this first fixed point formulation. So we need to take inverses in order
to turn the second pole into an attracting fixed point. However, inversion involves extracting a
$(t + 1)$-st root, so several branches occur. Additional inequalities are required in order to decide
which branch to take. We repeatedly use power series estimates in order to get the required
inequalities. In order to sharpen these estimates, we assume that $t \ge 30$.
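The attracting fixed point formulation for the dominant pole can be illustrated numerically. The sketch below is not the argument of the proof; it assumes that $D$ is replaced by its truncation with four terms, so that $D(1/z) = 0$ becomes $z = 2 - \frac{1}{z^{[2]} - 1}\bigl(1 - \frac{1}{z^{[3]} - 1}\bigr)$, and simply iterates this map starting from $z = 2$. The function name is hypothetical, and the floating point power $z^{[3]}$ restricts the sketch to $t$ up to about 30.

```python
def bracket(k, t):
    """[k] = 1 + t + ... + t**(k-1)."""
    return sum(t ** i for i in range(k))


def bootstrap_dominant_root(t, steps=60):
    """Iterate z -> 2 - (1 - 1/(z**[3] - 1)) / (z**[2] - 1), starting at z = 2."""
    e2, e3 = bracket(2, t), bracket(3, t)
    z = 2.0
    for _ in range(steps):
        z = 2.0 - (1.0 - 1.0 / (z ** e3 - 1.0)) / (z ** e2 - 1.0)
    return z


for t in (2, 3, 10):
    print(t, bootstrap_dominant_root(t))
```

Each step inserts the current estimate into the right-hand side, which is the bootstrapping idea described above; since further terms of $D$ are neglected, the limit agrees with $\rho$ only approximately.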

Proof. In the proof of this proposition, some more functions $\varepsilon_j(\dots)$ occur. We first allow
complex values for the $\varepsilon_j(\dots)$; it will later turn out that those occurring in Theorem 7 have
only real values.

In the following, we consider the case $t \ge 30$. Assume that $1/z$ is a pole of $F(q)$ with
$|z| \ge 1 + a/t$ for some $2 \ge a \ge \log 2$. As $N(q)$ is holomorphic for $|q| < 1$, cf. Lemma 8, $1/z$
must be a root of $D(q)$. Using $K = 3$, we get

= 1- — + 1 : , , 1 : + (D(l/z) - D 3 (l/z)), 



which is equivalent to 



z-l z-l z t+1 - 1 

1 



z t+l _ I 



Taking absolute values, (|4.1b|) yields 
2 

We have 



\z\ < 12 - z\ < 



+ (z-l)(D(l/z)-D 3 (l/z)). 



1 + 



I [2] 



|[3]_1 1_ 



I [2] 



> 1 + 



t 



= exp ((t + 1) log 

a - a 2 /2 



> exp ( (t + 1 



a 

2*2 



t 



a 

2t 2 



> exp a 



= exp ( a + 

for b = a - 31a 2 /60 > 0. By d4~2al) and (gjbl, we have 



> e a 1 + 



|[3]_1 1 



1 



|z|W-l 



< 



1.00001 



Consider now the case $a = \log 2$. Then (4.5), (4.6) and (4.7) yield

1 



z < 



1 + 



■21, 



1 



l.oooon 



< i 



4 
5? 



We conclude that $|z| \ge 1 + \frac{4}{5t}$. So using now $a = 4/5$, (4.5), (4.6) and (4.7) yield



\z\ < 



1 



,4/5 



1 + 



1.00001 \ 



J 



< 0.82 



and therefore $|z| \ge 1.18$. Inserting this and (4.7) in (4.5) now yields
2 - | : I < 12 - : | < — | 1 + 



1.18* +1 - 1 



1.00001\ 0.86 



J ~ 1.18* 



We conclude that $z = 2 + O(1.18^{-t})$. We now rewrite (4.4) as

1 



2 - 



z t+i _ i 



0(2- 



(4.4) 



(4.5) 



(4.6) 



(4.7) 



(4.9) 
Inserting $z = 2 + O(1.18^{-t})$ in the right-hand side of (4.9) yields
1 „ 1 



(2 + 0(1.18"*)) m - 1 



- (1 + 0(U.18-*)) . 



We now repeat the process: We insert this estimate in the right-hand side of (4.9) and get a
better estimate. After a few iterations (and taking care of all implicit constants), we finally get
(2.2). Inserting the lower and the upper bounds of (2.2) into $D_3(q)$ (and taking into account
$D(q) - D_3(q)$), we see that $D(q)$ changes sign within the interval, so there is certainly a root
$1/z$ of $D(q)$ fulfilling (2.2).



Inserting this asymptotic expression into $D'(q)$ and using Lemma 9, we get

$$|D'(1/z) + 4| \le 1.04\, t\, 2^{-t} \tag{4.10}$$

for $t \ge 30$. This shows that there is at most one zero of $D(1/z)$ within the bounds of the
asymptotic expression (2.2): if there were two, say $1/z_1$ and $1/z_2$, then



1 

Z-2 



1 

Zl 



D(l/z 2 )-D{l/z 1 )+A 
(D\q)+A) dq 





-3 







< 1.04t2~* 


1 


1 








Z2 





which implies $1/z_1 = 1/z_2$. Here, we integrate over the straight line from $1/z_1$ to $1/z_2$. The
estimate (4.10) also shows that there can only be a simple root. Thus we have shown that the
only root $1/z$ of $D$ with $|z| \ge 1 + \log 2 / t$ is a simple zero with $z$ as in (2.2). The residue (2.5)
follows upon inserting (2.2) into $N(1/z)/D'(1/z)$. Note that this also shows that the dominant
zero of the denominator does not cancel out against a zero of the numerator.

Now assume that $|D(1/z)| < 1/t^3$ holds for some $z$ with < $|z| < 1 + \log 2/t$. Inserting 
these bounds into (4.5), we get 

log 2 



z- 2 < 1 



41og 3 2-31og 2 2 + 121og2 1.5 . . 
H h —mt,z) 



--: r . 



(4.11) 



t 12t 2 t 3 

The intersection point with positive imaginary part of the circle of radius $1 + \log 2/t$ centred 
at the origin with the circle of radius $r'$ centred at 2 is denoted by $\zeta$. We obtain 



^=1 + 
In particular, we have 

and 



4 log 2 + i x / f log d 2-4 log^ 2 + 1 6 log 2 
it 



2.23 



E 7 (t). 



1|<|C-1|< 



1.14 



arg(z)\ < | arg^| < | log^| < 



1.18 



(4.12) 
(4.13) 



As $|D(1/z)| < 1/t^3$, we have (after multiplication with $z - 1$) 







z-2 



z t+i _ l 



2.01 . . 
Solving for $z^{t+1}$ yields 



2.01 

i 3 



As z = 1 + ^e 9 (t, z) by (4.12), we obtain 



yt+l 



1 19 

2 + — e 10 (t,z). 



We conclude that 



for some integer £ with 



exp 



2£ni 



+ 



t 



log 2 



1.19 



t+i 



< 



t+1 t+1 



< In particular, we have 



(4.14) 



2£tt 



argz 



+ 



t+1 t+1 



Slog 1 



1.19 



which, in view of (4.13), implies 

1 



cxp 



t + 1 



log 



$\ell = 0$. Thus (4.14) simplifies to 
1 , lo S 2 



1 19 
2+ — qsflC* 



t 



+ 



1.63 

~£2~ 



£n(M)- 



We may now repeat the argument a few times to finally obtain 



1 + 



log2 log2-log 2 2 t 41og 3 2 + 31og 2 2 + 61og2 , 3.45 



+ 



t 2t 2 24t 3 

Thus we have $|z| > r_3$. We have therefore shown that 

$$|D(q)| \ge \frac{1}{t^3} \quad \text{for } |q| = 1/r_3.$$



So we now assume that $D(1/z) = 0$ for some $z$ with $r_3 < |z| < 1 + \log 2/t$. Repeating the above 
steps with $1/t^3$ replaced by $0$ gives the slightly better bound $z = \rho_2$ with $\rho_2$ as in (2.3). 

Inserting the real upper and lower bounds implied by (2.3) into $D_3(q)$ and taking the error 
$D(q) - D_3(q)$ into account shows that $D(q)$ changes sign in this interval, so there is 
a real root $1/z = 1/\rho_2$ of $D(q)$ fulfilling (2.3). 

For the z in (2.3), we get 

2 



D'(l/z) 



log 2 



t 2 + 1.07te 13 (t, z) 



which implies that there is exactly one simple zero $1/z$ of $D(q)$ with $z$ fulfilling (2.3). By the 
same argument as above, this is the only zero $1/z$ with $r_3 < |z| < 1 + \log 2/t$. Computing 
$N(1/z)/D'(1/z)$ finally yields the residue given in (2.6). 
We already know that $|D(q)| \ge 1/t^3$ for all $q$ with $|q| = 1/r_3$. We also get $|N(q)| < 5t$. This 
yields (4.3). 

We now turn to the case $2 \le t < 30$. Here, the asymptotic estimates can be replaced by 
concrete numbers. All assertions have been proved using the interval arithmetic built into 
Sage [30]. First, we computed an estimate analogous to (4.11). The corresponding neighbourhood 
of 2 is subdivided into squares. Each of these squares is intersected with its image under 
(4.4) and the union of its images under the corresponding analogue of (4.14). If this intersection 
is empty or the square has no point of absolute value at least $r_3$, the square is discarded. 
Otherwise, the square is replaced by the smallest square containing the mentioned intersection. 
If this does not yield sufficient progress, the square is "bisected" into four squares. 
After a certain number of operations, there are only two small regions which might contain a 
root. Estimating the derivative, we see that there is at most one root in each of these regions. 
As it is suspected that these roots are real, the real bisection method is employed to determine 
the roots with higher precision. The approximation errors D(q) - Dx(q) can also be handled 
by adding the corresponding interval in the interval arithmetic. □ 
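
The final step of this computation, real bisection, can be sketched as follows. This is a plain floating-point illustration and not the interval-arithmetic computation carried out in Sage: it assumes the two-term truncation $g(z) = z - 2 + 1/(z^{t+1} - 1)$ of $(z - 1)D(1/z)$ in place of $D$ itself, and the bracket $[1.01,\ 1 + \log 2/t]$ and the value $t = 10$ are choices made only for this example. The names `g` and `bisect_root` are hypothetical.

```python
import math
from typing import Callable


def g(z: float, t: int) -> float:
    """Two-term truncation of (z - 1) * D(1/z); an assumption, not the full D."""
    return z - 2.0 + 1.0 / (z ** (t + 1) - 1.0)


def bisect_root(f: Callable[[float], float], lo: float, hi: float,
                tol: float = 1e-14, max_iter: int = 200) -> float:
    """Real bisection: requires a sign change of f on [lo, hi]."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        # Keep the half of the bracket on which the sign change persists.
        if f_lo * f_mid <= 0:
            hi, f_hi = mid, f_mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t = 10  # example value, chosen for illustration only
    root = bisect_root(lambda z: g(z, t), 1.01, 1.0 + math.log(2) / t)
    print(t, root, g(root, t))
```

A rigorous version would replace the floating-point evaluations by interval evaluations and enlarge each interval by a bound for the truncation error, as the text describes.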

We are now able to prove Theorem 7. 

Proof of Theorem 7. This is a consequence of singularity analysis [10], cf. also [12]. 

In this simple case, this also follows from Cauchy's integral formula and the residue theorem 
(and Proposition 10): 

$$\varepsilon_1(t,n)\,5t^4 r_3^{n} = \frac{1}{2\pi i}\oint_{|q| = 1/r_3} \frac{N(q)}{D(q)\,q^{n+1}}\,dq = -R\rho^{n+1} - R_2\rho_2^{n+1} + c_n.$$

□ 